136 research outputs found

    Blind Source Separation with Optimal Transport Non-negative Matrix Factorization

    Full text link
    Optimal transport as a loss for machine learning optimization problems has recently gained a lot of attention. Building upon recent advances in computational optimal transport, we develop an optimal transport non-negative matrix factorization (NMF) algorithm for supervised speech blind source separation (BSS). Optimal transport allows us to design and leverage a cost between short-time Fourier transform (STFT) spectrogram frequencies, which takes into account how humans perceive sound. We give empirical evidence that using our proposed optimal transport NMF leads to perceptually better results than Euclidean NMF, for both isolated voice reconstruction and BSS tasks. Finally, we demonstrate how to use optimal transport for cross domain sound processing tasks, where frequencies represented in the input spectrograms may be different from one spectrogram to another.Comment: 22 pages, 7 figures, 2 additional file

    Dual Gauss-Newton Directions for Deep Learning

    Full text link
    Inspired by Gauss-Newton-like methods, we study the benefit of leveraging the structure of deep learning objectives, namely, the composition of a convex loss function and of a nonlinear network, in order to derive better direction oracles than stochastic gradients, based on the idea of partial linearization. In a departure from previous works, we propose to compute such direction oracles via their dual formulation, leading to both computational benefits and new insights. We demonstrate that the resulting oracles define descent directions that can be used as a drop-in replacement for stochastic gradients, in existing optimization algorithms. We empirically study the advantage of using the dual formulation as well as the computational trade-offs involved in the computation of such oracles.Comment: Presented at the Duality Principles for Modern Machine Learning Workshop at ICML 202
    • …
    corecore